Mental distress detection and triage in forum posts: the LT3 CLPsych 2016 shared task system
This paper describes the contribution of LT3 for the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVM), cascaded binary SVMs and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all features score best in terms of F-score, whereas feature filtering with bi-normal separation and classifier ensembling are found to improve recall of alarming posts.
Economic event detection in company-specific news text
This paper presents a dataset and supervised classification approach for economic event detection in English news articles. Currently, the economic domain is lacking resources and methods for data-driven supervised event detection. The detection task is conceived as a sentence-level classification task for 10 different economic event types. Two different machine learning approaches were tested: a rich feature set Support Vector Machine (SVM) set-up and a word-vector-based long short-term memory recurrent neural network (RNN-LSTM) set-up. We show satisfactory results for most event types, with the linear kernel SVM outperforming the other experimental set-ups.
SENTiVENT Event Annotation Guidelines v1.1
Annotation Guidelines for economic Events in the SENTiVENT project for economic news text mining.
The goal of this annotation scheme is to produce a gold-standard labeled dataset for enabling supervised event extraction in the company-specific news text domain. The guidelines are based on the [Rich-ERE Guidelines][1] for [Events][2] and [Argument Fillers][3] but adapted to a corpus of business and financial news articles. We exclusively annotate event structures, unlike Rich ERE, which annotates Entities and Relations separately.
Extracting fine-grained economic events from business news
Based on a recently developed fine-grained event extraction dataset for the economic domain, we present a pilot study on supervised economic event extraction. We investigate how a state-of-the-art model for event extraction performs on trigger and argument identification and classification. While F1-scores above 50% are obtained on the task of trigger identification, we observe a large gap in performance compared to results on the benchmark ACE05 dataset. We show that single-token triggers do not provide sufficient discriminative information for a fine-grained event detection setup in a closed domain such as economics, since many classes have a large degree of lexico-semantic and contextual overlap.
Production of human recombinant proapolipoprotein A-I in Escherichia coli: purification and biochemical characterization
A human liver cDNA library was used to isolate a clone coding for apolipoprotein A-I (Apo A-I). The clone
carries the sequence for the prepeptide (18 amino acids), the propeptide (6 amino acids), and the mature protein
(243 amino acids). A coding cassette for the proapo A-I molecule was reconstructed by fusing synthetic
sequences, chosen to optimize expression and specifying the amino-terminal methionine and amino acids -6
to +14, to a large fragment of the cDNA coding for amino acids 15-243. The module was expressed in
pOTS-Nco, an Escherichia coli expression vector carrying the regulatable λ PL promoter, leading to the production
of proapolipoprotein A-I at up to 10% of total soluble proteins. The recombinant polypeptide was
purified and characterized in terms of apparent molecular mass, isoelectric point, and by both chemical and
enzymatic peptide mapping. In addition, it was assayed in vitro for the stimulation of the enzyme
lecithin:cholesterol acyltransferase. The data show for the first time that proapo A-I can be produced efficiently in
E. coli as a stable and undegraded protein having physical and functional properties indistinguishable from
those of the natural product.
Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity
The detection of online cyberbullying has seen an increase in societal
importance, popularity in research, and available open data. Nevertheless,
while computational power and affordability of resources continue to increase,
the access restrictions on high-quality data limit the applicability of
state-of-the-art techniques. Consequently, much of the recent research uses
small, heterogeneous datasets, without a thorough evaluation of applicability.
In this paper, we further illustrate these issues, as we (i) evaluate many
publicly available resources for this task and demonstrate difficulties with
data collection. These predominantly yield small datasets that fail to capture
the required complex social dynamics and impede direct comparison of progress.
We (ii) conduct an extensive set of experiments that indicate a general lack of
cross-domain generalization of classifiers trained on these sources, and openly
provide this framework to replicate and extend our evaluation criteria.
Finally, we (iii) present an effective crowdsourcing method: simulating
real-life bullying scenarios in a lab setting generates plausible data that can
be effectively used to enrich real data. This largely circumvents the
restrictions on data that can be collected, and increases classifier
performance. We believe these contributions can aid in improving the empirical
practices of future research in the field.
Dynamic effects in nonlinear magneto-optics of atoms and molecules
A brief review is given of topics relating to dynamical processes arising in
nonlinear interactions between light and resonant systems (atoms or molecules)
in the presence of a magnetic field.
Comment: 15 pages, 11 figures